A Novel Resampling Method for Variable Selection in Robust Regression

Author

  • Zafar Mahmood
Abstract

Variable selection in regression analysis is vitally important for data analysts and researchers who want to fit a parsimonious regression model. With the proliferation of large numbers of predictor variables and large data sets requiring analysis and empirical modeling, contamination has become a common problem. Accordingly, robust regression estimators are designed to fit contaminated data sets. Although much work has been done over the last three decades on robust regression methods for data sets contaminated with outliers, relatively little attention has been given to constructing the best subset of predictor variables in a robust regression model. We initially considered a cross-validation resampling technique that works well for variable selection in linear regression models; see Zafar and Salahuddin (2009, 2011). It turned out that the usual prediction errors, which are inflated by outliers, are not a reliable measure for robust model selection. We therefore propose a novel resampling procedure that introduces an alternative, robust prediction error based on the Winsor principle in the contaminated model. We demonstrate that superior results for robust model selection are obtainable by relaxing the requirement for the absolute minimum Winsorized prediction error while using our proposed optimum choice of the tuning constant. A simulation study indicates that the proposed technique works well.
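To make the idea of a Winsorized prediction error concrete, here is a minimal Python sketch. The abstract does not give the authors' exact definition, so everything specific here is an assumption for illustration only: the Huber M-estimator fitted by iteratively reweighted least squares, the MAD-based scale, K-fold cross-validation, the clipping rule `tuning * scale`, and the hypothetical function names `huber_fit`, `mad_scale`, and `winsorized_cv_error`. It is not the procedure proposed in the paper.

```python
import numpy as np

def mad_scale(r):
    """Robust scale: normalized median absolute deviation of residuals."""
    return 1.4826 * np.median(np.abs(r - np.median(r)))

def huber_fit(X, y, c=1.345, n_iter=50):
    """Huber M-estimate via iteratively reweighted least squares (illustrative)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from ordinary least squares
    for _ in range(n_iter):
        r = y - X @ beta
        u = np.abs(r) / (mad_scale(r) + 1e-12)
        # Huber weights: 1 inside the threshold, c/u outside it
        w = np.where(u <= c, 1.0, c / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta

def winsorized_cv_error(X, y, scale, k=5, tuning=2.0, seed=0):
    """K-fold CV error with each prediction error clipped at +/- tuning * scale."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    errors = np.empty(len(y))
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        beta = huber_fit(X[train], y[train])
        errors[fold] = y[fold] - X[fold] @ beta
    # Winsorize: large errors are pulled back to the bound instead of being dropped
    clipped = np.clip(errors, -tuning * scale, tuning * scale)
    return np.mean(clipped ** 2)

# Illustrative data (assumed): three candidate predictors, only the first two
# matter, and 10% of the responses are shifted to act as outliers.
rng = np.random.default_rng(1)
n = 200
Z = rng.normal(size=(n, 3))
y = 1 + 2 * Z[:, 0] - 3 * Z[:, 1] + rng.normal(size=n)
y[:20] += 15
ones = np.ones((n, 1))
X_full = np.hstack([ones, Z])
# A common scale from the full model keeps the clipping bound equal across subsets
scale = mad_scale(y - X_full @ huber_fit(X_full, y))
for cols in ([0], [0, 1], [0, 1, 2]):
    X_sub = np.hstack([ones, Z[:, cols]])
    print(cols, winsorized_cv_error(X_sub, y, scale))
```

Because each clipped error is bounded by `tuning * scale`, a handful of outliers cannot dominate the criterion the way they dominate an ordinary mean squared prediction error. Using one scale for all candidate subsets is a design choice made here so that the subsets are compared under the same bound.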


Related resources

NOVEL RESAMPLING METHODS FOR TUNING PARAMETER SELECTION IN ROBUST SPARSE REGRESSION MODELING

The robust lasso-type regularized regression is a useful tool for simultaneous estimation and variable selection even in the presence of outliers. Crucial issues in the robust modeling procedure include the selection of regularization parameters and also a tuning constant in outlier detection. Although the performance of the robust sparse regression strongly depends on the proper choice of thes...


Fuzzy Robust Regression Analysis with Fuzzy Response Variable and Fuzzy Parameters Based on the Ranking of Fuzzy Sets

Robust regression is an appropriate alternative to ordinal regression when outliers exist in a given data set. If we have fuzzy observations, ordinal regression methods cannot model them; in this case, fuzzy regression is a good method. When observations are fuzzy and there are outliers in the data sets, robust fuzzy regression methods are appropriate alternatives....


A novel approach in robust group decision making for supply strategic planning

Long-term planning is a challenging process in large industries. Such problems involve responding quickly and flexibly to existing variable requirements. Important strategic decisions in this field include how manufacturing facilities should be used, as well as the assignment and design of the system of delivery...


A Comparison between New Estimation and Variable Selection Methods in Regression Models by Using Simulation

In this paper, some new methods that have very recently been introduced for parameter estimation and variable selection in regression models are reviewed. Furthermore, we simulate several models in order to evaluate the performance of these methods under different situations. Finally, we compare the performance of these methods with that of the regular traditional variable selection methods such ...


Robust high-dimensional semiparametric regression using optimized differencing method applied to the vitamin B2 production data

Background and purpose: With advances in science, knowledge, and technology, we deal with high-dimensional data in which the number of predictors may considerably exceed the sample size. The main problems with high-dimensional data are the estimation of the coefficients and interpretation. For high-dimensional problems, classical methods are not reliable because of a large number of predictor variable...




Publication date: 2015